Tute - Completed (Week 7)

1. Abstract machines
2. Safety and Liveness
- 2.1. Decomposing Properties
- 2.2. Type Safety
3. Hindsight

1. Abstract machines

Recall the abstract machine from the Week 5 tutorial.

1.1. The E Machine

Evaluate the following expression using E-machine rules given in the lecture (where $\mathtt{Num}$ is abbreviated to $\mathtt{N}$):

\[\small (\mathtt{Apply}\ (\mathtt{Fun}\ (f.x. \mathtt{If}\ (\mathtt{Less}\ x\ (\mathtt{N}\ 2))\ (\mathtt{N}\ 0)\ (\mathtt{Apply}\ f\ (\mathtt{Minus}\ x\ (\mathtt{N}\ 1)))))\ (\mathtt{N}\ 3))\]

You don't need to write down every single step, but show how the contents of the stack and environments change during evaluation.

Feel free to skip even more steps of evaluation here. Just noting down the pattern of environments sitting on the stack that emerges, which is relevant to the third question.

\[\scriptsize\begin{array}{cl} & \circ\ \vert\ \bullet \succ (\mathtt{Apply}\ (\mathtt{Fun}\ (f.x.\ (\mathtt{If}\ (\mathtt{Less}\ x\ (\mathtt{N}\ 2))\ (\mathtt{N}\ 0)\ (\mathtt{Apply}\ f\ (\mathtt{Minus}\ x\ (\mathtt{N}\ 1))))))\ (\mathtt{N}\ 3))\\ \mapsto_E & (\texttt{Apply}\ \square\ (\texttt{N}\ 3)) \triangleright \circ\ \vert\ \bullet \succ (\texttt{Fun}\ (f.x.\ (\texttt{If}\ (\texttt{Less}\ x\ (\texttt{N}\ 2))\ (\texttt{N}\ 0)\ (\texttt{Apply}\ f\ (\texttt{Minus}\ x\ (\texttt{N}\ 1))))))\\ \mapsto_E & (\texttt{Apply}\ \square\ (\texttt{N}\ 3)) \triangleright \circ\ \vert\ \bullet \prec \langle\!\langle\bullet ; f.x.\ \cdots \rangle\!\rangle\\ \mapsto_E & (\texttt{Apply}\ \langle\!\langle\bullet ; f.x.\ \cdots\rangle\!\rangle\ \square) \triangleright \circ\ \vert\ \bullet \succ (\texttt{N}\ 3)\\ \mapsto_E & (\texttt{Apply}\ \langle\!\langle\bullet ; f.x.\ \cdots \rangle\!\rangle\ \square) \triangleright \circ\ \vert\ \bullet \prec 3\\[0.5em] & \text{Let $\eta_1$ be the environment: } f = \langle\!\langle\bullet ; f.x.\ \cdots\rangle\!\rangle, x = 3, \bullet\\[0.5em] \mapsto_E & \bullet \triangleright \circ\ \vert\ \eta_1 \succ (\texttt{If}\ (\texttt{Less}\ x\ (\texttt{N}\ 2))\ (\texttt{N}\ 0)\ (\texttt{Apply}\ f\ (\texttt{Minus}\ x\ (\texttt{N}\ 1))))\\ \mapsto_E & (\texttt{If}\ \square\ (\texttt{N}\ 0)\ (\texttt{Apply}\ f\ (\texttt{Minus}\ x\ (\texttt{N}\ 1)))) \triangleright \bullet \triangleright \circ\ \vert\ \eta_1 \succ (\texttt{Less}\ x\ (\texttt{N}\ 2))\\[0.25em] & \text{Skipping some steps...}\\[0.25em] \mapsto_E & (\texttt{If}\ \square\ (\texttt{N}\ 0)\ (\texttt{Apply}\ f\ (\texttt{Minus}\ x\ (\texttt{N}\ 1)))) \triangleright \bullet \triangleright \circ\ \vert\ \eta_1 \prec \texttt{False}\\ \mapsto_E & \bullet \triangleright \circ\ \vert\ \eta_1 \succ (\texttt{Apply}\ f\ (\texttt{Minus}\ x\ (\texttt{N}\ 1)))\\ \mapsto_E & (\texttt{Apply}\ \square\ (\texttt{Minus}\ x\ (\texttt{N}\ 1))) \triangleright \bullet \triangleright \circ\ \vert\ \eta_1 \succ f\\ \mapsto_E & (\texttt{Apply}\ \square\ (\texttt{Minus}\ x\ (\texttt{N}\ 1))) \triangleright \bullet \triangleright \circ\ \vert\ \eta_1 \prec \langle\!\langle\bullet ; f.x.\ \cdots\rangle\!\rangle\\ \mapsto_E & (\texttt{Apply}\ \langle\!\langle\bullet ; f.x.\ \cdots\rangle\!\rangle\ \square) \triangleright \bullet \triangleright \circ\ \vert\ \eta_1 \succ (\texttt{Minus}\ x\ (\texttt{N}\ 1))\\[0.25em] & \text{Skipping some steps...}\\[0.25em] \mapsto_E & (\texttt{Apply}\ \langle\!\langle\bullet ; f.x.\ \cdots\rangle\!\rangle\ \square) \triangleright \bullet \triangleright \circ\ \vert\ \eta_1 \prec 2\\ \end{array}\]

\[\scriptsize\begin{array}{cl} & \text{Let $\eta_2$ be the environment: } f = \langle\!\langle\bullet ; f.x.\ \cdots\rangle\!\rangle, x = 2, \bullet\\[0.5em] \mapsto_E & \eta_1 \triangleright \bullet \triangleright \circ\ \vert\ \eta_2 \succ (\texttt{If}\ (\texttt{Less}\ x\ (\texttt{N}\ 2))\ (\texttt{N}\ 0)\ (\texttt{Apply}\ f\ (\texttt{Minus}\ x\ (\texttt{N}\ 1))))\\ \mapsto_E & (\texttt{If}\ \square\ (\texttt{N}\ 0)\ (\texttt{Apply}\ f\ (\texttt{Minus}\ x\ (\texttt{N}\ 1)))) \triangleright \eta_1 \triangleright \bullet \triangleright \circ\ \vert\ \eta_2 \succ (\texttt{Less}\ x\ (\texttt{N}\ 2))\\[0.25em] & \text{Skipping some steps...}\\[0.25em] \mapsto_E & (\texttt{If}\ \square\ (\texttt{N}\ 0)\ (\texttt{Apply}\ f\ (\texttt{Minus}\ x\ (\texttt{N}\ 1)))) \triangleright \eta_1 \triangleright \bullet \triangleright \circ\ \vert\ \eta_2 \prec \texttt{False}\\ \mapsto_E & \eta_1 \triangleright \bullet \triangleright \circ\ \vert\ \eta_2 \succ (\texttt{Apply}\ f\ (\texttt{Minus}\ x\ (\texttt{N}\ 1)))\\ \mapsto_E & (\texttt{Apply}\ \square\ (\texttt{Minus}\ x\ (\texttt{N}\ 1))) \triangleright \eta_1 \triangleright \bullet \triangleright \circ\ \vert\ \eta_2 \succ f\\ \mapsto_E & (\texttt{Apply}\ \square\ (\texttt{Minus}\ x\ (\texttt{N}\ 1))) \triangleright \eta_1 \triangleright \bullet \triangleright \circ\ \vert\ \eta_2 \prec \langle\!\langle\bullet ; f.x.\ \cdots\rangle\!\rangle\\ \mapsto_E & (\texttt{Apply}\ \langle\!\langle\bullet ; f.x.\ \cdots\rangle\!\rangle\ \square) \triangleright \eta_1 \triangleright \bullet \triangleright \circ\ \vert\ \eta_2 \succ (\texttt{Minus}\ x\ (\texttt{N}\ 1))\\[0.25em] & \text{Skipping some steps...}\\[0.25em] \mapsto_E & (\texttt{Apply}\ \langle\!\langle\bullet ; f.x.\ \cdots\triangleright \eta_1 \triangleright \bullet \triangleright \circ\ \vert\ \eta_2 \prec 1\\[0.5em] & \text{Let $\eta_3$ be the environment: } f = \langle\!\langle\bullet ; f.x.\ \cdots\rangle\!\rangle, x = 1, \bullet\\[0.5em] \mapsto_E & \eta_2 \triangleright \eta_1 \triangleright \bullet \triangleright \circ\ \vert\ \eta_3 \succ (\texttt{If}\ (\texttt{Less}\ x\ (\texttt{N}\ 2))\ (\texttt{N}\ 0)\ (\texttt{Apply}\ f\ (\texttt{Minus}\ x\ (\texttt{N}\ 1))))\\ \mapsto_E & (\texttt{If}\ \square\ (\texttt{N}\ 0)\ (\texttt{Apply}\ f\ (\texttt{Minus}\ x\ (\texttt{N}\ 1)))) \triangleright \eta_2 \triangleright \eta_1 \triangleright \bullet \triangleright \circ\ \vert\ \eta_3 \succ (\texttt{Less}\ x\ (\texttt{N}\ 2))\\[0.25em] & \text{Skipping some steps...}\\[0.25em] \mapsto_E & (\texttt{If}\ \square\ (\texttt{N}\ 0)\ (\texttt{Apply}\ f\ (\texttt{Minus}\ x\ (\texttt{N}\ 1)))) \triangleright \eta_2 \triangleright \eta_1 \triangleright \bullet \triangleright \circ\ \vert\ \eta_3 \prec \texttt{True}\\ \mapsto_E & \eta_2 \triangleright \eta_1 \triangleright \bullet \triangleright \circ\ \vert\ \eta_3 \succ (\texttt{N}\ 0)\\ \mapsto_E & \eta_2 \triangleright \eta_1 \triangleright \bullet \triangleright \circ\ \vert\ \eta_3 \prec 0\\ \mapsto_E & \eta_1 \triangleright \bullet \triangleright \circ\ \vert\ \eta_2 \prec 0\\ \mapsto_E & \bullet \triangleright \circ\ \vert\ \eta_1 \prec 0\\ \mapsto_E & \circ\ \vert\ \bullet \prec 0\\ \end{array}\]

1.2. Closures

Give an example of a program that evaluates to different results in an environment semantics, depending on whether or not closures are used. You may use the MinHS dialect used in your assignment.

\[\mathbf{let}\ x = 2\ \mathbf{in}\ (\mathbf{let}\ x = 3\ \mathbf{in}\ (\mathbf{recfun}\ f\ y = x))\ 1\]

Evaluates to 3 with closures, and 2 without.

1.3. Tail Calls

Consider the following two programs:

sumInt :: Int -> Int
sumInt 0 = 0
sumInt n = n + (sumInt (n-1))

sumInt' :: Int -> Int
sumInt' n = sum2 0 n
  where
    sum2 m 0 = m
    sum2 m n = sum2 (m+n) (n-1)

In a strict (call-by-value) language, the first program would lead to a stack overflow if applied to a sufficiently large number, but the second would not.

Explain how tail-call elimination can be applied to these examples.
How can we change the definition of our E-Machine to optimize tail-calls?
In Haskell, the second version will still exhaust all available memory if applied to a sufficiently large number. Why is this the case?

1: The first example is not tail-recursive, in that the last computation done by the function body is an addition, not a recursive function call.

In the second example, the last thing performed by the inner computation is a recursive function call, which means the stack frame allocated can be re-used for the function call, without allocating any more memory.

2: The pattern produced by tail calls in the E-Machine is a series of environments sitting on the stack with no intermediate computations, as in the above example.

Changing the rule for function application to include a special case when an environment is already on the stack (and not saving the current environment in that case) just leaves the environment that was first pushed onto the stack, and doesn't bother to save the rest:

\[ \small \dfrac{}{ (\mathtt{Apply}\ \langle\!\langle \eta_f; f.x.\ e \rangle\!\rangle\ \square) \triangleright \eta_s \triangleright s\ \vert\ \eta \prec v \mapsto_E \eta_s \triangleright s\ \vert\ f=\langle\!\langle \cdots \rangle\!\rangle, x=v, \eta_f \succ e } \]

3: Haskell, being lazily evaluated, does not bother to evaluate the accumulator parameter until it is pattern matched on:

sum2 0 10 = sum2 (0 + 10) (10-1)
          = sum2 (0 + 10) 9          -- evaluated from pattern match
          = sum2 (0 + 10 + 9) (9-1) 
          = sum2 (0 + 10 + 9) 8      -- pattern match
          = sum2 (0 + 10 + 9 + 8) (8-1)

As can be seen, the accumulator is constantly growing in size, as it is never matched on.

GHC can be fooled into evaluating the accumulator by doing a pointless pattern match:

sum2 0 0 = 0
sum2 m 0 = 0 
sum2 m n = sum2 (m + n) (n - 1)

But a more idiomatic way is to use BangPatterns:

sum2 !m 0 = 0 
sum2 !m n = sum2 (m + n) (n - 1)

Or, if extensions are not desired, the seq combinator, which forces evaluation of the left hand side before returning the right:

sum2 m 0 = 0 
sum2 m n = let m' = m + n 
            in m' `seq` sum2 (m + n) (n - 1)

2. Safety and Liveness

a) Type Safety is said to be a safety property, and termination is said to be a liveness property. Suppose I was impatient, and stated a stronger termination property as follows:

The process will terminate and halt execution within a billion steps.

Is this version still a liveness property?

No, it is a safety property, as a violation can be observed after a finite amount of steps.

b) What are the properties of progress and preservation? Are they safety or liveness properties?

Progress states that well-typed terms are not stuck. Preservation states that well-typed terms evaluate to terms that have the same type.

Both of these properties are safety properties, as they comprise the type safety property.

2.1. Decomposing Properties

For each of the following program properties, decompose it into the intersection of a safety property and a liveness property.

The program will allocate exactly 100MB of memory.
The program will not overflow the stack.
The program will return the sum of its inputs.
The program will call the function foo.

The safety component is that the program will not allocate more than 100MB of memory. The liveness component is that the program will allocate at least 100MB of memory.
The safety component is that the program will not overflow the stack. The liveness component is the set of all behaviours.
The liveness component is that the program will return. The safety component is that if the program returns, it will return the sum of its inputs.
The safety component is the set of all behaviours. The liveness component is that the function foo will be called.

2.2. Type Safety

Why is the handling of partial functions important for a type safe language? How does it impact progress and preservation?

In order to satisfy progress, we need a successor state for every well-typed expression, including invocations of partial functions. Thus, we must introduce an error state, that is a final state, that is the result of evaluating a partial function outside of its domain.

In order to satisfy preservation, we allow error to take on any type at all.

3. Hindsight

This question is about the hindsight language from Assignment 1.

Write a recursive hindsight function (recfun expression) with type Int -> F Int, that computes the n:th triangular number. C.f. the sumInt Haskell function above

recfun sumInt :: (Int -> F Int) n =
  reduce n == 0
  to b in
    if b then
      produce 0
    else
      reduce n - 1
      to m in
        reduce force sumInt m
        to t in n + t

Now consider the auxiliary sum2 function above. If we transcribe this into a hindsight function of type Int -> Int -> F Int we will be unable to mimic the evaluation order used by Haskell. Why? And what type signature would we use if we wanted to mimic the Haskell evaluation?

Haskell will not evaluate any of the arithmetic on the accumulator, since sum2 creates no demand for its value (see previous answers). Thus all the arithmetic will be deferred until we hit the base case. In hindsight, such deferred computation must be explicit: we must put the unevaluated arithmetic computation in a thunk, and pass the thunk around in the accumulator. But the accumulator is an Int, and a deferred Int-producing computation would have type U(F Int).

An implementation with Haskell's evaluation order would instead need to have type Int -> U(F Int) -> F Int.
When we came up with hindsight, by far the most contentious issue was the treatment of Cons. It's a bit annoying that you have to force it. Why do you think it is the way it is? What are the alternatives?
In MinHS, the types are Cons :: Int -> [Int] -> [Int]. Nil :: [Int]. That is, Cons has a function type and Nil does not. But in hindsight, [Int] is a value type, and a function is a computation type. This creates a tension that we considered three options for resolving:
1. Uniformly treat all constructors as values. This requires treating Cons as a suspended computation, giving it the type U(Int -> [Int] -> F [Int]).
2. Give up on the uniform treatment of constructors, and introduce a distinction between value constructors (Nil, True and , False) and computation constructors (Cons). This allows us to assign Cons the more natural type Int -> [Int] -> F [Int], and Cons 5 Nil becomes a valid expression. However, Cons 5 (Cons 4 Nil) would not be valid (why?).
3. Uniformly treat all constructors as computations, yielding Cons :: Int -> [Int] -> F [Int] and e.g. Nil :: F [Int]. However, Cons 5 Nil is still not well-formed (why?).
We went with the first option. We also briefly considered making the tail suspended, yielding the type U(Int -> U(F [Int]) -> F [Int]).